Abstract

We characterize the squares occurring in infinite overlap-free binary words and
construct various α power-free binary words containing infinitely many overlaps.

1 Introduction 

If α is a rational number, a word w is an α power if there exist words x and x', with x'
a prefix of x, such that w = x^n x' and α = n + |x'|/|x|. We refer to |x| as a period of w.

An α+ power is a word that is a β power for some β > α. A word is α power-free (resp.
α+ power-free) if none of its subwords is an α power (resp. α+ power). A 2 power is called
a square; a 2+ power is called an overlap.
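As a concrete reading of these definitions, the following Python sketch computes the largest α for which a finite nonempty word is an α power, namely its length divided by its smallest period. The function names are illustrative choices made here and do not come from the paper.

```python
from fractions import Fraction

def smallest_period(w):
    """Smallest p >= 1 with w[i] == w[i + p] for every valid i (w nonempty)."""
    n = len(w)
    return next(p for p in range(1, n + 1)
                if all(w[i] == w[i + p] for i in range(n - p)))

def exponent(w):
    """Largest alpha such that the nonempty word w is an alpha power."""
    return Fraction(len(w), smallest_period(w))

def is_square(w):
    # a 2 power: w = xx for some nonempty x
    half, rest = divmod(len(w), 2)
    return len(w) > 0 and rest == 0 and w[:half] == w[half:]

def is_overlap(w):
    # a 2+ power: the exponent is strictly greater than 2
    return exponent(w) > 2
```

With these helpers, 00110011 is a square but not an overlap, while 001100110 has period 4 and length 9, so it is a 9/4 power and hence an overlap.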

Thue [17] constructed an infinite overlap-free binary word; however, Dekking [7] showed
that any such infinite word must contain arbitrarily large squares. Shelton and Soni [?]
characterized the overlap-free squares, but it is not hard to show that there are some
overlap-free squares, such as 00110011, that cannot occur in an infinite overlap-free binary
word. In this paper, we characterize those overlap-free squares that do occur in infinite
overlap-free binary words.

Shur [?] considered the bi-infinite overlap-free and 7/3 power-free binary words and
showed that these classes of words were identical. There have been several subsequent
papers [?] that have shown various similarities between the classes of overlap-free
binary words and 7/3 power-free binary words. Here we contrast the two classes of words by
showing that there exist one-sided infinite 7/3 power-free binary words containing infinitely
many overlaps. More generally, we show that for any real number α > 2 there exists a real
number β arbitrarily close to α such that there exists an infinite β+ power-free binary word
containing infinitely many β powers.

All binary words considered in the sequel will be over the alphabet {0, 1}. We therefore
use the notation w̄ to denote the binary complement of w; that is, the word obtained from
w by replacing 0 with 1 and 1 with 0.

2 Properties of the Thue-Morse morphism 

In this section we present some useful properties of the Thue-Morse morphism; i.e., the
morphism μ defined by μ(0) = 01 and μ(1) = 10. It is well-known [?] that the Thue-Morse
word

t = μ^ω(0) = 0110100110010110 ···

is overlap-free.
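A finite prefix of t can be generated by iterating the morphism, and the overlap-freeness of that prefix can be checked by brute force. The sketch below is only a sanity check on a prefix and proves nothing about the infinite word; `mu` and `has_overlap` are names chosen here for illustration.

```python
def mu(w, k=1):
    """Apply the Thue-Morse morphism (0 -> 01, 1 -> 10) k times."""
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def has_overlap(w):
    """True if some factor of w has a period p and length at least 2p + 1."""
    n = len(w)
    for p in range(1, n):
        run = 0                      # consecutive i with w[i] == w[i + p]
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run > p:              # factor of length run + p >= 2p + 1
                return True
    return False

t = mu("0", 8)                       # prefix of t of length 256
```

Here `t[:16]` reproduces the sixteen letters displayed above, and `has_overlap(t)` is false for this prefix.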

The following property of μ is easy to verify.

Lemma 1. Let x and y be binary words. Then x is a prefix (resp. suffix) of y if and only
if μ(x) is a prefix (resp. suffix) of μ(y).

Shur proved the following useful theorem. 

Theorem 2 (Shur). Let w be a binary word and let α > 2 be a real number. Then w is
α power-free if and only if μ(w) is α power-free.
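Theorem 2 can be spot-checked exhaustively on short words. The sketch below assumes that "α power-free" means that no factor has exponent α or larger (the reading used later in the paper, where an overlap of period less than 3 is said to contain a 7/3 power); the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def mu(w):
    """Thue-Morse morphism: 0 -> 01, 1 -> 10."""
    return "".join("01" if c == "0" else "10" for c in w)

def max_exponent(w):
    """Largest |u|/p over factors u of w with period p (1 if there is none)."""
    best, n = Fraction(1), len(w)
    for p in range(1, n):
        run = longest = 0            # runs of positions with w[i] == w[i + p]
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            longest = max(longest, run)
        if longest:                  # longest factor with period p has length longest + p
            best = max(best, Fraction(longest + p, p))
    return best

def is_power_free(w, alpha):
    # assumption: alpha power-free means every factor has exponent < alpha
    return max_exponent(w) < alpha

def counterexamples(alpha, max_len):
    """Binary words of length <= max_len on which the equivalence would fail."""
    return [w for n in range(1, max_len + 1)
            for w in map("".join, product("01", repeat=n))
            if is_power_free(w, alpha) != is_power_free(mu(w), alpha)]
```

For example, `counterexamples(Fraction(7, 3), 8)` should come back empty, in agreement with the theorem.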

The following sharper version of one direction of this theorem (implicit in [?]) is also
useful.

Theorem 3. Suppose μ(w) contains a subword u of period p, with |u|/p > 2. Then w
contains a subword v of length ⌈|u|/2⌉ and period p/2.

Karhumäki and Shallit [?] gave the following generalization of the factorization theorem
of Restivo and Salemi [?]. The extension to infinite words is clear.

Theorem 4 (Karhumäki and Shallit). Let x ∈ {0,1}* be α power-free, 2 < α ≤ 7/3.
Then there exist u, v ∈ {ε, 0, 1, 00, 11} and an α power-free y ∈ {0,1}* such that x = uμ(y)v.
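The factorization in Theorem 4 can be searched for directly on a finite word. The following sketch enumerates every triple (u, y, v) with x = uμ(y)v and u, v drawn from {ε, 0, 1, 00, 11}; it does not check that y is α power-free, and the names `BORDERS`, `mu_preimage` and `factorizations` are introduced here purely for illustration.

```python
BORDERS = ("", "0", "1", "00", "11")
INVERSE = {"01": "0", "10": "1"}     # inverse of the Thue-Morse morphism on blocks

def mu_preimage(w):
    """Return y with mu(y) == w, or None if w is not in the image of mu."""
    if len(w) % 2:
        return None
    blocks = [w[i:i + 2] for i in range(0, len(w), 2)]
    if any(b not in INVERSE for b in blocks):
        return None
    return "".join(INVERSE[b] for b in blocks)

def factorizations(x):
    """All (u, y, v) with x == u + mu(y) + v and u, v in BORDERS."""
    found = []
    for u in BORDERS:
        for v in BORDERS:
            if len(u) + len(v) <= len(x) and x.startswith(u) and x.endswith(v):
                y = mu_preimage(x[len(u):len(x) - len(v)])
                if y is not None:
                    found.append((u, y, v))
    return found
```

For the square 00110011 the only factorization of this shape is u = 0, y = 010, v = 1, that is, 00110011 = 0 μ(010) 1.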

3 Overlap-free squares

Let

A = {00, 11, 010010, 101101}

and let

𝒜 = ⋃_{k≥0} μ^k(A).

Pansiot [12] and Brlek [?] gave the following characterization of the squares in t.

Theorem 5 (Pansiot; Brlek). The set of squares in t is exactly the set 𝒜.

We can use this result to prove the following. 

Proposition 6. For any position i, there is at most one square in t beginning at position i. 

Proof. Suppose to the contrary that there exist distinct squares x and y that begin at position
i. Without loss of generality, suppose that x and y begin with 0. Then by Theorem 5,
x = μ^p(u) and y = μ^q(v), for some p, q and u, v ∈ {00, 010010}. Suppose p ≤ q and let
w = μ^{q−p}(v). By Lemma 1, either u is a proper prefix of w or w is a proper prefix of u,
neither of which is possible for any choice of u, v ∈ {00, 010010}. □
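Both Theorem 5 and Proposition 6 can be observed on a finite prefix of t. The sketch below lists the squares starting at each position of the prefix of length 256 and compares them with the words μ^k(a), a ∈ A; it is a finite experiment under these assumptions, not a proof, and the helper names are ours.

```python
def mu(w, k=1):
    """Apply the Thue-Morse morphism (0 -> 01, 1 -> 10) k times."""
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def squares_starting_at(w, i):
    """All squares xx occurring in w with their first letter at index i."""
    return [w[i:i + 2 * h]
            for h in range(1, (len(w) - i) // 2 + 1)
            if w[i:i + h] == w[i + h:i + 2 * h]]

t = mu("0", 8)                                   # prefix of t of length 256
A = ["00", "11", "010010", "101101"]
# k < 8 covers every word mu^k(a) that is short enough to fit inside the prefix
script_A = {mu(a, k) for a in A for k in range(8)}
per_position = [squares_starting_at(t, i) for i in range(len(t))]
```

Each entry of `per_position` should contain at most one word, and every word found should belong to `script_A`.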

The set 𝒜 does not contain all possible overlap-free squares. Shelton and Soni [?]
characterized the overlap-free squares (the result is also attributed to Thue in [?]).

Theorem 7 (Shelton and Soni). The overlap-free binary squares are the conjugates of
the words in 𝒜.
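For short squares, Theorem 7 can be compared against an exhaustive search. The sketch below collects all overlap-free binary squares of length at most 16 and, separately, all conjugates of the words μ^k(a), a ∈ A, of length at most 16; the length bound and the helper names are arbitrary choices made here.

```python
from itertools import product

def mu(w):
    """Thue-Morse morphism: 0 -> 01, 1 -> 10."""
    return "".join("01" if c == "0" else "10" for c in w)

def has_overlap(w):
    """True if some factor of w has a period p and length at least 2p + 1."""
    n = len(w)
    for p in range(1, n):
        run = 0
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run > p:
                return True
    return False

def conjugates(w):
    """All cyclic rotations of w."""
    return {w[i:] + w[:i] for i in range(len(w))}

MAX_LEN = 16
brute_force = {x + x
               for h in range(1, MAX_LEN // 2 + 1)
               for x in map("".join, product("01", repeat=h))
               if not has_overlap(x + x)}

predicted = set()
for a in ("00", "11", "010010", "101101"):
    w = a
    while len(w) <= MAX_LEN:         # walk through a, mu(a), mu^2(a), ...
        predicted |= conjugates(w)
        w = mu(w)
```

The two sets `brute_force` and `predicted` should coincide; in particular 00110011 appears in both as a conjugate of μ^2(00) = 01100110.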

Some overlap-free squares cannot occur in any infinite overlap-free binary word, as the 
following lemma shows. 

Lemma 8. Let x = μ^k(z) for some k ≥ 0 and z ∈ {011011, 100100}. Then xa contains an
overlap for all a ∈ {0, 1}.

Proof. It is easy to see that x = uvvuvv for some u, v ∈ {0, 1}*, where u and v begin with
different letters. Thus one of uvvuvva or vva is an overlap. □
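Lemma 8 is easy to confirm for small k by brute force. In the sketch below (helper names chosen here), every one-letter right extension of μ^k(z) is tested for an overlap, for k from 0 to 4.

```python
def mu(w, k=1):
    """Apply the Thue-Morse morphism (0 -> 01, 1 -> 10) k times."""
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def has_overlap(w):
    """True if some factor of w has a period p and length at least 2p + 1."""
    n = len(w)
    for p in range(1, n):
        run = 0
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run > p:
                return True
    return False

# (k, z, a) -> does mu^k(z) followed by the letter a contain an overlap?
extension_has_overlap = {(k, z, a): has_overlap(mu(z, k) + a)
                         for k in range(5)
                         for z in ("011011", "100100")
                         for a in "01"}
```

Every value in `extension_has_overlap` should be true, whereas the unextended words μ^k(z) themselves are overlap-free.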

We can characterize the squares that can occur in an infinite overlap-free binary word.

Let

B = {001001, 110110}

and let

ℬ = ⋃_{k≥0} μ^k(B).

Theorem 9. The set of squares that can occur in an infinite overlap-free binary word is
𝒜 ∪ ℬ. Furthermore, if w is an infinite overlap-free binary word containing a subword
x ∈ ℬ, then w begins with x and there are no other occurrences of x in w.

Proof. Let w be an infinite overlap-free binary word beginning with a square yy ∉ 𝒜 ∪ ℬ.
Suppose further that yy is a smallest such square that can be extended to an infinite
overlap-free word. If |y| ≤ 3, then yy ∉ 𝒜 ∪ ℬ is one of 011011 or 100100, neither of which can be
extended to an infinite overlap-free word by Lemma 8.

We assume then that |y| > 3. Since, by Theorem 7, yy is a conjugate of a word in 𝒜, we
have two cases.

Case 1: yy = μ(zz) for some z ∈ {0, 1}*. By Theorem 4, w = μ(zzw') for some infinite
w', where zzw' is overlap-free. Thus zz is a smaller square not in 𝒜 ∪ ℬ that can be extended
to an infinite overlap-free word, contrary to our assumption.

Case 2: yy = aμ(zz')ā for some a ∈ {0,1} and z, z' ∈ {0,1}*. By Theorem 4, yy is
followed by a in w, and so yya is an overlap, contrary to our assumption.

Since both cases lead to a contradiction, our assumption that yy ∉ 𝒜 ∪ ℬ must be false.

To see that each word in 𝒜 ∪ ℬ does occur in some infinite overlap-free binary word, note
that Allouche, Currie, and Shallit [2] have shown that the word s = 001001t̄ is overlap-free.
Now consider the words μ^k(s) and their complements, which are overlap-free for all k ≥ 0.

Finally, to see that any occurrence of x ∈ ℬ in w must occur at the beginning of w, we
note that by an argument similar to that used in Lemma 8, ax contains an overlap for all
a ∈ {0, 1}, and so x occurs at the beginning of w. □
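The second half of the proof can be illustrated on a finite prefix. The sketch below assumes that s is the word 001001 followed by the binary complement of t, and checks a prefix of s and of μ(s) for overlaps; it also confirms that 001001 cannot be preceded by any letter without creating an overlap. All helper names are chosen here.

```python
def mu(w, k=1):
    """Apply the Thue-Morse morphism (0 -> 01, 1 -> 10) k times."""
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def complement(w):
    """Binary complement: swap 0 and 1."""
    return w.translate(str.maketrans("01", "10"))

def has_overlap(w):
    """True if some factor of w has a period p and length at least 2p + 1."""
    n = len(w)
    for p in range(1, n):
        run = 0
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            if run > p:
                return True
    return False

# assumption: s = 001001 followed by the complement of the Thue-Morse word
s = "001001" + complement(mu("0", 8))            # prefix of s of length 262
```

On this prefix, 001001 occurs only at position 0, which matches the last assertion of Theorem 9.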



4 Words containing infinitely many overlaps 

In this section we construct various infinite α power-free binary words containing infinitely
many overlaps. We begin by considering the infinite 7/3 power-free binary words.

Proposition 10. For all p ≥ 1, an infinite 7/3 power-free word contains only finitely many
occurrences of overlaps with period p.

Proof. Let x be an infinite 7/3 power-free word containing infinitely many overlaps with
period p. Let k ≥ 0 be the smallest integer satisfying p < 3·2^k. Suppose x contains an
overlap w with period p starting in a position ≥ 2^{k+1}. Then by Theorem 4, we can write

x = u_1 μ(u_2) ··· μ^{k−1}(u_k) μ^k(y),

where each u_i ∈ {ε, 0, 1, 00, 11}. The prefix u_1 μ(u_2) ··· μ^{k−1}(u_k) has length at most
2 + 4 + ··· + 2^k < 2^{k+1}, so the overlap w occurs as a subword of μ^k(y). By Theorem 3,
applied k times, y contains an overlap with period p/2^k < 3. But any overlap with period < 3
contains a 7/3 power. Thus, x contains a 7/3 power, a contradiction. □

The following theorem provides a striking contrast to Shur's result that the bi-infinite 
7/3 power-free words are overlap-free. 

Theorem 11. There exists a 7/3 power-free binary word containing infinitely many overlaps. 

Proof. We define the following sequence of words: A_0 = 00 and A_{n+1} = 0μ^2(A_n), n ≥ 0.
The first few terms in this sequence are

A_0 = 00

A_1 = 001100110

A_2 = 0011001101001100101100110100110010110

We first show that in the limit as n → ∞, this sequence converges to an infinite word
a. It suffices to show that for all n, A_n is a prefix of A_{n+1}. We proceed by induction on
n. Certainly, A_0 = 00 is a prefix of A_1 = 0μ^2(00) = 001100110. Now A_n = 0μ^2(A_{n−1}),
A_{n+1} = 0μ^2(A_n), and by induction, A_{n−1} is a prefix of A_n. Applying Lemma 1, we see that
A_n is a prefix of A_{n+1}, as required.

Note that for all n, A_{n+1} contains μ^{2n}(A_1) as a subword. Since A_1 is an overlap with
period 4, μ^{2n}(A_1) contains 2^{2n} overlaps with period 2^{2n+2}. Thus, a contains infinitely many
overlaps.

We must show that a does not contain a 7/3 power. It suffices to show that A_n does not
contain a 7/3 power for all n ≥ 0. Again, we proceed by induction on n. Clearly, A_0 = 00
does not contain a 7/3 power. Consider A_{n+1} = 0μ^2(A_n). By induction, A_n is 7/3
power-free, and by Theorem 2, so is μ^2(A_n). Thus, if A_{n+1} contains a 7/3 power, such a 7/3 power
must occur as a prefix of A_{n+1}. Note that A_{n+1} begins with 00110011. The word 00110011
cannot occur anywhere else in A_{n+1}, as that would imply that A_{n+1} contained a cube 000
or 111, or the 5/2 power 1001100110. If A_{n+1} were to begin with a 7/3 power with period
≥ 8, it would contain two occurrences of 00110011, contradicting our earlier observation.
We conclude that the period of any such 7/3 power is less than 8. Checking that no such
7/3 power exists is now a finite check and is left to the reader. □
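The construction in this proof is straightforward to reproduce. The sketch below builds A_0, ..., A_4 from the recurrence A_{n+1} = 0μ^2(A_n) and measures the largest exponent of any factor; it assumes that "7/3 power-free" means every factor has exponent below 7/3, and the function names are ours.

```python
from fractions import Fraction

def mu(w, k=1):
    """Apply the Thue-Morse morphism (0 -> 01, 1 -> 10) k times."""
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def max_exponent(w):
    """Largest |u|/p over factors u of w with period p (1 if there is none)."""
    best, n = Fraction(1), len(w)
    for p in range(1, n):
        run = longest = 0
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            longest = max(longest, run)
        if longest:
            best = max(best, Fraction(longest + p, p))
    return best

def A(n):
    """A_0 = 00 and A_{n+1} = 0 mu^2(A_n)."""
    w = "00"
    for _ in range(n):
        w = "0" + mu(w, 2)
    return w

words = [A(n) for n in range(5)]     # lengths 2, 9, 37, 149, 597
```

Each word in `words` is a prefix of the next, A_1 has largest exponent exactly 9/4 (so it is an overlap), and all five words stay strictly below 7/3.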

In fact, we can prove the following stronger statement. 

Theorem 12. There exist uncountably many 7/3 power-free binary words containing
infinitely many overlaps.

Proof. For a finite binary sequence b, we define an operator g_b on binary words recursively
by

g_ε(w) = w

g_{0b}(w) = μ^2(g_b(w))

g_{1b}(w) = 0μ^2(g_b(w)).

Note that g_b(0) always starts with a 0, so that for any finite binary words p and b, g_p(0) is
always a prefix of g_{pb}(0). Since g_0(0) is not a prefix of g_1(0), g_{p0}(0) is not a prefix of g_{p1}(0) for
any p, so that distinct b give distinct words. Given an infinite binary sequence b = b_1 b_2 b_3 ···,
where the b_i ∈ {0, 1}, define an infinite binary sequence w_b to be the limit of

g_ε(0), g_{b_1}(0), g_{b_1 b_2}(0), g_{b_1 b_2 b_3}(0), ...

By an earlier argument, each w_b is 7/3 power-free. Since g_1(00) = 001100110 is an
overlap, g_{b1}(00) = g_b(001100110) ends with an overlap for any finite word b. Thus, each 1 in
b introduces an overlap in w_b. Since uncountably many binary sequences contain infinitely
many 1's, uncountably many of the w_b are 7/3 power-free words containing infinitely many
overlaps. □
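The operators g_b translate directly into a recursive function. The sketch below reads the recursion as g_{0b}(w) = μ^2(g_b(w)) and g_{1b}(w) = 0μ^2(g_b(w)), computes g_b(0) for every binary b of length at most 3, and checks the prefix, distinctness and 7/3 power-freeness claims on these finite words; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

def mu(w, k=1):
    """Apply the Thue-Morse morphism (0 -> 01, 1 -> 10) k times."""
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def max_exponent(w):
    """Largest |u|/p over factors u of w with period p (1 if there is none)."""
    best, n = Fraction(1), len(w)
    for p in range(1, n):
        run = longest = 0
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            longest = max(longest, run)
        if longest:
            best = max(best, Fraction(longest + p, p))
    return best

def g(b, w):
    """g_b(w), with the first letter of b giving the outermost operation."""
    if not b:
        return w
    inner = mu(g(b[1:], w), 2)
    return ("0" if b[0] == "1" else "") + inner

def binary_words(max_len):
    """All binary strings of length 0 .. max_len."""
    return ["".join(c) for n in range(max_len + 1)
            for c in product("01", repeat=n)]

images = {b: g(b, "0") for b in binary_words(3)}
```

For instance `g("1", "00")` returns the overlap 001100110, and `images[p + b]` starts with `images[p]` for all the index words tried.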

Next, we show that the sequence a constructed in the proof of Theorem 11 is an automatic
sequence (in the sense of [?]).

Proposition 13. The sequence a is 4-automatic.

Proof. We show that a = g(h^ω(0)), where h and g are the morphisms defined by

h(0) = 0134    g(0) = 0

h(1) = 2134    g(1) = 0

h(2) = 3234    g(2) = 0

h(3) = 2321    g(3) = 1

h(4) = 3421    g(4) = 1.

We make some observations concerning 2-letter subwords: The sequence h^ω(0) clearly
does not contain any of the words 11, 14, 22, 24, 31, 33, 41 or 44. In fact, neither 12 nor 43
appears as a subword either: Words 12 and 43 do not appear internally in h(i), 0 ≤ i ≤ 4;
therefore, if 43 appears in h^n(0), it must 'cross the boundary' in one of h(12), h(14), h(22)
or h(24). Since 14, 22 and 24 do not appear in h^ω(0), word 43 can only appear in h^n(0) as a
descendant of a subword 12 in h^{n−1}(0). However, the situation is symmetrical; word 12 can
only appear in h^n(0) as a descendant of a subword 43 in h^{n−1}(0). By induction, neither 43
nor 12 ever appears.

The point of the previous paragraph is that 

h(0) always occurs in the context h(0)2
h(1) always occurs in the context h(1)2
h(2) always occurs in the context h(2)2
h(3) always occurs in the context h(3)3
h(4) always occurs in the context h(4)3

The word h^ω(0) can thus be parsed in terms of a new morphism f:

f(0) = 1342

f(1) = 1342

f(2) = 2342

f(3) = 3213

f(4) = 4213.

The parsing in terms of f works as follows: If we write h^ω(0) = 0w, then w = f(0w). It
is useful to rewrite this relation in terms of the finite words h^n(0). For non-negative integer
n let x_n be the unique letter such that h^n(0)x_n is a prefix of h^ω(0). Thus x_0 = 1, x_1 = 2,
etc. We then have

h^n(0)x_n = 0f(h^{n−1}(0)), n ≥ 1. (1)

Since for all a ∈ {0,1,2,3,4}, g(f(a)) = μ^2(g(a)), we have g(f(u)) = μ^2(g(u)) for all
words u. Therefore, applying g to (1),

g(h^n(0)x_n) = g(0f(h^{n−1}(0)))

             = g(0)g(f(h^{n−1}(0)))

             = 0μ^2(g(h^{n−1}(0))), n ≥ 1.

From this relation we show by induction that A_n is the prefix of g(h^{n+1}(0)) of length
(4^{n+1} + 3·4^n − 1)/3. Certainly, A_0 = 00 is the prefix of length 2 of g(h(0)) = 0011. Consider
A_n = 0μ^2(A_{n−1}). We can assume inductively that A_{n−1} is the prefix of g(h^n(0)) of length
(4^n + 3·4^{n−1} − 1)/3. Writing g(h^n(0)) = A_{n−1}z for some z, we have

g(h^{n+1}(0)x_{n+1}) = 0μ^2(g(h^n(0)))

                     = 0μ^2(A_{n−1}z)
                     = A_n μ^2(z),

whence A_n is a prefix of g(h^{n+1}(0)). Since |A_n| = 4|A_{n−1}| + 1, we have

|A_n| = (4^{n+1} + 3·4^n − 1)/3, as required. □
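The relationship between the morphisms h, g and the words A_n can be tested numerically. In the sketch below the tables `H` and `G` encode h and g, with g(0) = g(1) = g(2) = 0 taken as an assumption consistent with g(h(0)) = 0011 and g(f(a)) = μ^2(g(a)); `apply_morphism`, `h_power` and `A` are names introduced here.

```python
H = {"0": "0134", "1": "2134", "2": "3234", "3": "2321", "4": "3421"}
# assumption: g sends 0, 1, 2 to 0 and 3, 4 to 1
G = {"0": "0", "1": "0", "2": "0", "3": "1", "4": "1"}
F = {"0": "1342", "1": "1342", "2": "2342", "3": "3213", "4": "4213"}
MU2 = {"0": "0110", "1": "1001"}     # the square of the Thue-Morse morphism

def apply_morphism(table, w):
    """Apply a letter-to-word substitution to every letter of w."""
    return "".join(table[c] for c in w)

def h_power(n):
    """h^n(0)."""
    w = "0"
    for _ in range(n):
        w = apply_morphism(H, w)
    return w

def A(n):
    """A_0 = 00 and A_{n+1} = 0 mu^2(A_n)."""
    w = "00"
    for _ in range(n):
        w = "0" + apply_morphism(MU2, w)
    return w

coded = [apply_morphism(G, h_power(n + 1)) for n in range(5)]
```

For n from 0 to 4, `coded[n]` should begin with A_n, whose length matches (4^{n+1} + 3·4^n − 1)/3, and neither 12 nor 43 should appear in the iterates of h.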

The result of Theorem 12 can be strengthened even further.

Theorem 14. For every real number α > 2 there exists a real number β arbitrarily close
to α, such that there is an infinite β+ power-free binary word containing infinitely many
β powers.

Proof. Let s ≥ 3 be a positive integer, and let r = ⌊α + 1⌋. Let t be the largest positive
integer such that r − t/2^s > α, and such that the word obtained by removing a prefix of
length t from μ^s(0) begins with 00. Let β = r − t/2^s. Since α ≥ r − 1, we have t < 2^s. Also,
μ^3(0) = 01101001 and μ^3(1) = 10010110 are of length 8, and both contain 00 as a subword;
it follows that |α − β| ≤ 8/2^s, so that by choosing large enough s, β can be made arbitrarily
close to α.

We construct sequences of words A_n, B_n and C_n. Define C_0 = 00. For each n ≥ 0:

1. Let A_n = 0^{r−2}C_n.

2. Let B_n = μ^s(A_n).

3. Remove the first t letters from B_n to obtain a new word C_{n+1} beginning with 00.

Since each A_n begins with the r power 0^r, each B_n = μ^s(A_n) begins with an r power of
period 2^s. Removing the first t letters ensures that C_{n+1} commences with an (r2^s − t)/2^s
power, viz., a β power. The limit of the C_n gives the desired infinite word. Let us check that
this limit exists:

Let w be the word consisting of the first t letters of μ^s(0). Since all the A_n commence
with 0 by construction, all the B_n commence with μ^s(0), and hence with w. This means
that B_n = wC_{n+1} for each n.

We show that A_n is always a prefix of A_{n+1} by induction. Certainly A_0 is a prefix of A_1.
Assume that A_{n−1} is a prefix of A_n. Since A_n = 0^{r−2}C_n and A_{n+1} = 0^{r−2}C_{n+1}, A_n is a prefix
of A_{n+1} if C_n is a prefix of C_{n+1}. Since B_{n−1} = wC_n and B_n = wC_{n+1}, C_n is a prefix of C_{n+1}
if B_{n−1} is a prefix of B_n. By Lemma 1, B_{n−1} is a prefix of B_n if A_{n−1} is a prefix of A_n, which
is our inductive assumption. We conclude that A_n is a prefix of A_{n+1}.

It follows that C_n is a prefix of C_{n+1} for n ≥ 0, so that the limit of the C_n exists. It will
thus suffice to prove the following claim:

Claim: The A_n, B_n and C_n satisfy the following:

1. The word C_n contains no β+ powers.

2. The only β+ power in A_n is 0^r.

3. Any β+ powers in B_n appear only in the prefix μ^s(0^r).

Certainly C_0 contains no β+ powers, and since β > r − 1, the only β+ power in A_0 is 0^r.
Suppose then that the claim holds for A_n and C_n.

Now suppose that B_n = μ^s(0^{r−2})μ^s(C_n) contains a β+ power u with period p. Since
C_n contains no β+ powers, Theorem 2 ensures that μ^s(C_n) contains no β+ powers. We can
therefore write B_n = xuy where |x| < |μ^s(0^{r−2})|. In other words, u overlaps μ^s(0^{r−2}) from
the right. By Theorem 3, the preimage of B_n under μ, i.e., μ^{s−1}(A_n), contains a β+ power
of length at least |u|/2 and period p/2. In fact, iterating this argument, A_n contains a β+
power of period p/2^s of length at least |u|/2^s. Since the only β+ power in A_n is 0^r, with
period 1, we see that p/2^s = 1, whence p = 2^s and |u| ≤ r2^s.

Recall that B_n has a prefix μ^s(0^r) which also has period 2^s, and that this prefix is
overlapped by u. It follows that all of xu is a β+ power with period p = 2^s. However, as just
argued, this means that |xu| ≤ r2^s = |μ^s(0^r)|, so that u is contained in μ^s(0^r) and part 3 of
our claim holds for B_n. We now show that parts 1 and 2 hold for C_{n+1} and A_{n+1} respectively,
and the truth of our claim will follow by induction.

Part 1 follows immediately from part 3. 

Now suppose that A_{n+1} contains a β+ power u. Recall that A_{n+1} = 0^{r−2}C_{n+1}, and C_{n+1}
begins with 00, but contains no β+ powers. It follows that u is not a subword of C_{n+1}.
Therefore, 000 must be a prefix of u. If u = 0^q for some integer q, then q ≤ r by the
construction of A_{n+1}, and

r ≥ q > β > α ≥ r − 1.

This implies that q = r, and u = 0^r, as claimed. If we cannot write u = 0^q, then |u|_1 ≥ 1.
Because u is a 2+ power, 000 must appear twice in u with a 1 lying somewhere between the
two appearances. This implies that 000 is a subword of C_{n+1}, and hence of B_n = μ^s(A_n).
However, no word of the form μ(w) contains 000. This is a contradiction. □
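The construction in this proof can be run for small parameters. The sketch below takes α = 9/4 and s = 4 as sample inputs (any rational α > 2 and sufficiently large s should do), computes r, t and β as described, and builds C_0, C_1, C_2; the function names and the sample values are our own choices.

```python
from fractions import Fraction
from math import floor

def mu(w, k=1):
    """Apply the Thue-Morse morphism (0 -> 01, 1 -> 10) k times."""
    for _ in range(k):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def max_exponent(w):
    """Largest |u|/p over factors u of w with period p (1 if there is none)."""
    best, n = Fraction(1), len(w)
    for p in range(1, n):
        run = longest = 0
        for i in range(n - p):
            run = run + 1 if w[i] == w[i + p] else 0
            longest = max(longest, run)
        if longest:
            best = max(best, Fraction(longest + p, p))
    return best

def parameters(alpha, s):
    """r, t and beta as in the proof; raises ValueError if s is too small."""
    r = floor(alpha) + 1
    block = mu("0", s)
    t = max(t for t in range(1, 2 ** s - 1)
            if r - Fraction(t, 2 ** s) > alpha and block[t:t + 2] == "00")
    return r, t, r - Fraction(t, 2 ** s)

def c_words(alpha, s, count):
    """C_0, ..., C_count built from A_n = 0^(r-2) C_n and B_n = mu^s(A_n)."""
    r, t, _ = parameters(alpha, s)
    words = ["00"]
    for _ in range(count):
        b_word = mu("0" * (r - 2) + words[-1], s)
        words.append(b_word[t:])     # drop the first t letters of B_n
    return words

r, t, beta = parameters(Fraction(9, 4), 4)
C = c_words(Fraction(9, 4), 4, 2)
```

With these inputs r = 3, t = 9 and β = 39/16; C_1 has length 39 and period 16, so it is itself a β power, and C_2 extends it without exceeding exponent β.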

We conclude by presenting the following open problem. 

Does there exist a characterization (in the sense of [?]) of the infinite 7/3
power-free binary words?